Generalizing trajectories


To reduce the size (number of points) of trajectory objects, we can generalize them, for example, using:

  • Spatial generalization, such as the Douglas-Peucker algorithm
  • Temporal generalization by down-sampling, i.e. increasing the time interval between records
  • Spatiotemporal generalization, e.g. using the Top-Down Time Ratio algorithm


A closely related type of operation is trajectory smoothing, which is covered in a separate notebook.

In [1]:
import pandas as pd
import geopandas as gpd
import movingpandas as mpd
import shapely as shp
import hvplot.pandas 
import matplotlib.pyplot as plt

from geopandas import GeoDataFrame, read_file
from shapely.geometry import Point, LineString, Polygon
from datetime import datetime, timedelta
from holoviews import opts

import warnings
warnings.filterwarnings('ignore')

plot_defaults = {'linewidth':5, 'capstyle':'round', 'figsize':(9,3), 'legend':True}
opts.defaults(opts.Overlay(active_tools=['wheel_zoom'], frame_width=500, frame_height=400))

mpd.show_versions()
MovingPandas 0.10.rc1

SYSTEM INFO
-----------
python     : 3.9.13 | packaged by conda-forge | (main, May 27 2022, 16:50:36) [MSC v.1929 64 bit (AMD64)]
executable : H:\miniconda3\envs\mpd-ex\python.exe
machine    : Windows-10-10.0.19043-SP0

GEOS, GDAL, PROJ INFO
---------------------
GEOS       : None
GEOS lib   : None
GDAL       : 3.5.0
GDAL data dir: None
PROJ       : 9.0.0
PROJ data dir: H:\miniconda3\pkgs\proj-9.0.0-h1cfcee9_1\Library\share\proj

PYTHON DEPENDENCIES
-------------------
geopandas  : 0.10.2
pandas     : 1.4.2
fiona      : 1.8.21
numpy      : 1.22.4
shapely    : 1.8.2
rtree      : 1.0.0
pyproj     : 3.3.1
matplotlib : 3.5.2
mapclassify: 2.4.3
geopy      : 2.2.0
holoviews  : 1.14.9
hvplot     : 0.8.0
geoviews   : 1.9.5
stonesoup  : 0.1b9
In [2]:
gdf = read_file('../data/geolife_small.gpkg')
traj_collection = mpd.TrajectoryCollection(gdf, 'trajectory_id', t='t')
In [3]:
original_traj = traj_collection.trajectories[1]
print(original_traj)
Trajectory 2 (2009-06-29 07:02:25 to 2009-06-29 11:13:12) | Size: 897 | Length: 38764.6m
Bounds: (116.319212, 39.971703, 116.592616, 40.082514)
LINESTRING (116.590957 40.071961, 116.590905 40.072007, 116.590879 40.072027, 116.590915 40.072004, 
In [4]:
original_traj.plot(column='speed', vmax=20, **plot_defaults)
Out[4]:
<AxesSubplot:>

Spatial generalization (DouglasPeuckerGeneralizer)

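Try different tolerance settings and observe how the line geometry, and therefore also its length, changes. The tolerance is given in the units of the trajectory CRS; this dataset uses geographic coordinates, so the tolerance of 0.001 used below is a distance in degrees: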

In [5]:
help(mpd.DouglasPeuckerGeneralizer)
Help on class DouglasPeuckerGeneralizer in module movingpandas.trajectory_generalizer:

class DouglasPeuckerGeneralizer(TrajectoryGeneralizer)
 |  DouglasPeuckerGeneralizer(traj)
 |  
 |  Generalizes using Douglas-Peucker algorithm (as implemented in shapely/Geos).
 |  
 |  tolerance : float
 |      Distance tolerance in trajectory CRS units
 |  
 |  References
 |  ----------
 |  * Douglas, D., & Peucker, T. (1973). Algorithms for the reduction of the number
 |    of points required to represent a digitized line or its caricature.
 |    The Canadian Cartographer 10(2), 112–122. doi:10.3138/FM57-6770-U75U-7727.
 |  
 |  Examples
 |  --------
 |  
 |  >>> mpd.DouglasPeuckerGeneralizer(traj).generalize(tolerance=1.0)
 |  
 |  Method resolution order:
 |      DouglasPeuckerGeneralizer
 |      TrajectoryGeneralizer
 |      builtins.object
 |  
 |  Methods inherited from TrajectoryGeneralizer:
 |  
 |  __init__(self, traj)
 |      Create TrajectoryGeneralizer
 |      
 |      Parameters
 |      ----------
 |      traj : Trajectory or TrajectoryCollection
 |  
 |  generalize(self, tolerance)
 |      Generalize the input Trajectory/TrajectoryCollection.
 |      
 |      Parameters
 |      ----------
 |      tolerance : any type
 |          Tolerance threshold, differs by generalizer
 |      
 |      Returns
 |      -------
 |      Trajectory/TrajectoryCollection
 |          Generalized Trajectory or TrajectoryCollection
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from TrajectoryGeneralizer:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

In [6]:
dp_generalized = mpd.DouglasPeuckerGeneralizer(original_traj).generalize(tolerance=0.001)
dp_generalized.plot(column='speed', vmax=20, **plot_defaults)
Out[6]:
<AxesSubplot:>
In [7]:
dp_generalized 
Out[7]:
Trajectory 2 (2009-06-29 07:02:25 to 2009-06-29 11:13:12) | Size: 31 | Length: 36921.9m
Bounds: (116.319709, 39.971775, 116.592616, 40.082369)
LINESTRING (116.590957 40.071961, 116.590367 40.073957, 116.590367 40.073957, 116.590367 40.073957, 
In [8]:
print(f'Original length: {original_traj.get_length()}')
print(f'Generalized length: {dp_generalized.get_length()}')
Original length: 38764.57548255216
Generalized length: 36921.91845209962
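
For intuition about what the tolerance controls, the recursion behind Douglas-Peucker can be sketched in plain Python. This is an illustrative sketch only, not the shapely/GEOS code that MovingPandas calls: the helper names `point_segment_distance` and `douglas_peucker` and the sample line are made up for this example, and distances are treated as planar.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b (all (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: fall back to point-to-point distance
        return math.hypot(px - ax, py - ay)
    # Position of the projection along the segment, clamped to [0, 1]
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Hypothetical helper: recursive Douglas-Peucker on a list of (x, y)."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    # Find the interior point farthest from the chord start-end
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], start, end)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        # Everything in between is close enough: keep only the end points
        return [start, end]
    # Otherwise keep the farthest point and simplify both halves
    left = douglas_peucker(points[:max_idx + 1], tolerance)
    right = douglas_peucker(points[max_idx:], tolerance)
    return left[:-1] + right


# Made-up sample line, not taken from the Geolife data
line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = douglas_peucker(line, tolerance=1.0)
print(simplified)
```

Only points that deviate from the current chord by more than the tolerance survive. Larger tolerances remove more points and can only shorten the line, which matches the difference between the two lengths printed above.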

Temporal generalization (MinTimeDeltaGeneralizer)

An alternative generalization method is to down-sample the trajectory so that consecutive records are at least a given time delta apart:

In [9]:
help(mpd.MinTimeDeltaGeneralizer)
Help on class MinTimeDeltaGeneralizer in module movingpandas.trajectory_generalizer:

class MinTimeDeltaGeneralizer(TrajectoryGeneralizer)
 |  MinTimeDeltaGeneralizer(traj)
 |  
 |  Generalizes based on time.
 |  
 |  This generalization ensures that consecutive rows are at least a certain
 |  timedelta apart.
 |  
 |  tolerance : datetime.timedelta
 |      Desired minimum time difference between consecutive rows
 |  
 |  Examples
 |  --------
 |  
 |  >>> mpd.MinTimeDeltaGeneralizer(traj).generalize(tolerance=timedelta(minutes=10))
 |  
 |  Method resolution order:
 |      MinTimeDeltaGeneralizer
 |      TrajectoryGeneralizer
 |      builtins.object
 |  
 |  Methods inherited from TrajectoryGeneralizer:
 |  
 |  __init__(self, traj)
 |      Create TrajectoryGeneralizer
 |      
 |      Parameters
 |      ----------
 |      traj : Trajectory or TrajectoryCollection
 |  
 |  generalize(self, tolerance)
 |      Generalize the input Trajectory/TrajectoryCollection.
 |      
 |      Parameters
 |      ----------
 |      tolerance : any type
 |          Tolerance threshold, differs by generalizer
 |      
 |      Returns
 |      -------
 |      Trajectory/TrajectoryCollection
 |          Generalized Trajectory or TrajectoryCollection
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from TrajectoryGeneralizer:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)

In [10]:
time_generalized = mpd.MinTimeDeltaGeneralizer(original_traj).generalize(tolerance=timedelta(minutes=1))
time_generalized.plot(column='speed', vmax=20, **plot_defaults)
Out[10]:
<AxesSubplot:>
In [11]:
time_generalized.to_point_gdf().head(10)
Out[11]:
id sequence trajectory_id tracker geometry
t
2009-06-29 07:02:25 1556 1090 2 0 POINT (116.59096 40.07196)
2009-06-29 07:03:25 1569 1103 2 0 POINT (116.59069 40.07225)
2009-06-29 07:04:25 1582 1116 2 0 POINT (116.59037 40.07396)
2009-06-29 07:05:25 1595 1129 2 0 POINT (116.59260 40.07411)
2009-06-29 07:06:25 1610 1144 2 0 POINT (116.59258 40.07420)
2009-06-29 07:07:25 1623 1157 2 0 POINT (116.59235 40.07602)
2009-06-29 07:08:25 1635 1169 2 0 POINT (116.58939 40.07794)
2009-06-29 07:09:25 1647 1181 2 0 POINT (116.58911 40.08171)
2009-06-29 07:10:25 1659 1193 2 0 POINT (116.58829 40.08232)
2009-06-29 07:11:25 1672 1206 2 0 POINT (116.58689 40.08230)
In [12]:
original_traj.to_point_gdf().head(10)
Out[12]:
id sequence trajectory_id tracker geometry
t
2009-06-29 07:02:25 1556 1090 2 0 POINT (116.59096 40.07196)
2009-06-29 07:02:30 1557 1091 2 0 POINT (116.59091 40.07201)
2009-06-29 07:02:35 1558 1092 2 0 POINT (116.59088 40.07203)
2009-06-29 07:02:40 1559 1093 2 0 POINT (116.59091 40.07200)
2009-06-29 07:02:45 1560 1094 2 0 POINT (116.59096 40.07198)
2009-06-29 07:02:50 1561 1095 2 0 POINT (116.59101 40.07196)
2009-06-29 07:02:55 1562 1096 2 0 POINT (116.59099 40.07198)
2009-06-29 07:03:00 1563 1097 2 0 POINT (116.59098 40.07199)
2009-06-29 07:03:05 1564 1098 2 0 POINT (116.59097 40.07200)
2009-06-29 07:03:10 1565 1099 2 0 POINT (116.59097 40.07200)
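
The two tables above show the effect: records originally 5 seconds apart are thinned to one per minute. The idea can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the MovingPandas implementation: `downsample_min_time_delta` is a hypothetical helper, the records are synthetic `(timestamp, x, y)` tuples, and keeping the final record so that the end time is preserved is a choice made for this example.

```python
from datetime import datetime, timedelta


def downsample_min_time_delta(records, min_delta):
    """Hypothetical helper: keep records at least min_delta apart.

    records is a time-ordered list of (timestamp, x, y) tuples.
    """
    if not records:
        return []
    kept = [records[0]]
    for rec in records[1:]:
        # Keep a record only if enough time has passed since the last kept one
        if rec[0] - kept[-1][0] >= min_delta:
            kept.append(rec)
    # Assumption for this sketch: always keep the last record
    if kept[-1] is not records[-1]:
        kept.append(records[-1])
    return kept


# Synthetic track: 30 records, 5 seconds apart
start = datetime(2009, 6, 29, 7, 2, 25)
records = [(start + timedelta(seconds=5 * i), float(i), 0.0) for i in range(30)]

downsampled = downsample_min_time_delta(records, timedelta(minutes=1))
for t, x, y in downsampled:
    print(t, x, y)
```

Because each time difference is measured from the last kept record rather than from a fixed clock grid, irregularly sampled input yields intervals that are at least, but not necessarily exactly, the requested delta.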

Spatiotemporal generalization (TopDownTimeRatioGeneralizer)

In [13]:
help(mpd.TopDownTimeRatioGeneralizer)
Help on class TopDownTimeRatioGeneralizer in module movingpandas.trajectory_generalizer:

class TopDownTimeRatioGeneralizer(TrajectoryGeneralizer)
 |  TopDownTimeRatioGeneralizer(traj)
 |  
 |  Generalizes using Top-Down Time Ratio algorithm proposed by Meratnia & de By (2004).
 |  
 |  This is a spatiotemporal trajectory generalization algorithm. Where Douglas-Peucker
 |  simply measures the spatial distance between points and original line geometry,
 |  Top-Down Time Ratio (TDTR) measures the distance between points and their
 |  spatiotemporal projection on the trajectory. These projections are calculated based
 |  on the ratio of travel times between the segment start and end times and the point
 |  time.
 |  
 |  tolerance : float
 |      Distance tolerance (distance returned by shapely Point.distance function)
 |  
 |  References
 |  ----------
 |  * Meratnia, N., & de By, R.A. (2004). Spatiotemporal compression techniques for
 |    moving point objects. In International Conference on Extending Database Technology
 |    (pp. 765-782). Springer, Berlin, Heidelberg.
 |  
 |  Examples
 |  --------
 |  
 |  >>> mpd.TopDownTimeRatioGeneralizer(traj).generalize(tolerance=1.0)
 |  
 |  Method resolution order:
 |      TopDownTimeRatioGeneralizer
 |      TrajectoryGeneralizer
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  td_tr(self, df, tolerance)
 |  
 |  ----------------------------------------------------------------------
 |  Methods inherited from TrajectoryGeneralizer:
 |  
 |  __init__(self, traj)
 |      Create TrajectoryGeneralizer
 |      
 |      Parameters
 |      ----------
 |      traj : Trajectory or TrajectoryCollection
 |  
 |  generalize(self, tolerance)
 |      Generalize the input Trajectory/TrajectoryCollection.
 |      
 |      Parameters
 |      ----------
 |      tolerance : any type
 |          Tolerance threshold, differs by generalizer
 |      
 |      Returns
 |      -------
 |      Trajectory/TrajectoryCollection
 |          Generalized Trajectory or TrajectoryCollection
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from TrajectoryGeneralizer:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
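
The docstring describes the key difference from Douglas-Peucker: each point is compared with the position it would have on the simplified segment at the same point in time. A minimal plain-Python sketch of that idea follows. It is not the MovingPandas code: the helper names `time_ratio_distance` and `top_down_time_ratio` are hypothetical, points are synthetic `(t, x, y)` tuples with `t` in seconds, and distances are treated as planar.

```python
import math


def time_ratio_distance(p, start, end):
    """Distance between p and its time-proportional position on start-end."""
    t, x, y = p
    ts, xs, ys = start
    te, xe, ye = end
    # Share of the segment's travel time that has elapsed at time t
    ratio = 0.0 if te == ts else (t - ts) / (te - ts)
    proj_x = xs + ratio * (xe - xs)
    proj_y = ys + ratio * (ye - ys)
    return math.hypot(x - proj_x, y - proj_y)


def top_down_time_ratio(points, tolerance):
    """Hypothetical helper: recursive TDTR on a list of (t, x, y) tuples."""
    if len(points) < 3:
        return list(points)
    start, end = points[0], points[-1]
    # Find the interior point farthest from its spatiotemporal projection
    max_dist, max_idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = time_ratio_distance(points[i], start, end)
        if d > max_dist:
            max_dist, max_idx = d, i
    if max_dist <= tolerance:
        return [start, end]
    left = top_down_time_ratio(points[:max_idx + 1], tolerance)
    right = top_down_time_ratio(points[max_idx:], tolerance)
    return left[:-1] + right


# Synthetic track along a straight line with a 20-second stop at x = 10
track = [
    (0, 0.0, 0.0), (10, 5.0, 0.0), (20, 10.0, 0.0), (30, 10.0, 0.0),
    (40, 10.0, 0.0), (50, 15.0, 0.0), (60, 20.0, 0.0),
]
tdtr_sketch = top_down_time_ratio(track, tolerance=1.0)
print(tdtr_sketch)
```

All points of this synthetic track lie on one straight line, so a purely spatial simplification would keep only the two end points. The time-ratio distance keeps the start and end of the stop as well, because the object is not where constant-speed travel would place it at those times.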

In [14]:
tdtr_generalized = mpd.TopDownTimeRatioGeneralizer(original_traj).generalize(tolerance=0.001)

Let's compare this to the basic Douglas-Peucker result:

In [15]:
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(19,4))
tdtr_generalized.plot(ax=axes[0], column='speed', vmax=20, **plot_defaults)
dp_generalized.plot(ax=axes[1], column='speed', vmax=20, **plot_defaults)
Out[15]:
<AxesSubplot:>